Accelerating Materials Discovery: Learning a Universal Representation of Chemical Processes for Cross-Domain Property Prediction
Tsitsvero, Mikhail, Nakao, Atsuyuki, Ikebata, Hisaki
Experimental validation of chemical processes is slow and costly, limiting exploration in materials discovery. Machine learning can prioritize promising candidates, but existing data in patents and literature is heterogeneous and difficult to use. We introduce a universal directed-tree process-graph representation that unifies unstructured text, molecular structures, and numeric measurements into a single machine-readable format. To learn from this structured data, we developed a multi-modal graph neural network with a property-conditioned attention mechanism. Trained on approximately 700,000 process graphs from nearly 9,000 diverse documents, our model learns semantically rich embeddings that generalize across domains. When fine-tuned on compact, domain-specific datasets, the pretrained model achieves strong performance, demonstrating that universal process representations learned at scale transfer effectively to specialized prediction tasks with minimal additional data.
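A minimal sketch of what such a directed-tree process graph might look like as a data structure: each node is one process step carrying heterogeneous payloads (free text, molecular identifiers such as SMILES strings, and numeric measurements), with edges pointing to downstream steps. The field names and schema here are illustrative assumptions, not the paper's actual representation.

```python
from dataclasses import dataclass, field

@dataclass
class ProcessNode:
    """One step in a directed-tree process graph (illustrative schema)."""
    text: str                                          # unstructured description
    molecules: list = field(default_factory=list)      # e.g. SMILES strings
    measurements: dict = field(default_factory=dict)   # name -> (value, unit)
    children: list = field(default_factory=list)       # downstream steps

def count_steps(root: ProcessNode) -> int:
    """Total number of steps in the directed tree."""
    return 1 + sum(count_steps(c) for c in root.children)

# A toy two-step synthesis: mix precursors, then anneal the product.
anneal = ProcessNode(
    "anneal at 450 C for 2 h",
    measurements={"temperature": (450.0, "C"), "time": (2.0, "h")},
)
mix = ProcessNode(
    "mix precursors in ethanol",
    molecules=["CCO"],          # ethanol as a SMILES string
    children=[anneal],
)
```

A node like this keeps all three modalities side by side, which is the property the abstract attributes to the unified format.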
M$^2$OE$^2$-GL: A Family of Probabilistic Load Forecasters That Scales to Massive Customers
Li, Haoran, Cheng, Zhe, Guo, Muhao, Weng, Yang, Sun, Yannan, Tran, Victor, Chainaranont, John
Probabilistic load forecasting is widely studied and underpins power system planning, operation, and risk-aware decision making. Deep learning forecasters have shown strong ability to capture complex temporal and contextual patterns, achieving substantial accuracy gains. However, at the scale of thousands or even hundreds of thousands of loads in large distribution feeders, a deployment dilemma emerges: training and maintaining one model per customer is prohibitive in computation and storage, while a single global model ignores distributional shifts across customer types, locations, and phases. Prior work typically focuses on single-load forecasters, global models across multiple loads, or adaptive/personalized models for relatively small settings, and rarely addresses the combined challenges of heterogeneity and scalability in large feeders. We propose M$^2$OE$^2$-GL, a global-to-local extension of the M$^2$OE$^2$ probabilistic forecaster. We first pretrain a single global M$^2$OE$^2$ base model across all feeder loads, then apply lightweight fine-tuning to derive a compact family of group-specific forecasters. Evaluated on realistic utility data, M$^2$OE$^2$-GL yields substantial error reductions while remaining scalable to very large numbers of loads.
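The global-to-local recipe (pretrain one shared model, then cheaply adapt a copy per customer group) can be sketched as follows. The tiny Gaussian-output MLP stands in for the M$^2$OE$^2$ forecaster itself, and the choice to freeze the trunk and adapt only the output head is an illustrative assumption about what "lightweight fine-tuning" could mean, not the paper's exact procedure.

```python
import copy
import torch
import torch.nn as nn

class Forecaster(nn.Module):
    """Stand-in probabilistic forecaster: 24 lagged loads -> (mean, log-variance)."""
    def __init__(self, n_in: int = 24, n_hidden: int = 32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU())
        self.head = nn.Linear(n_hidden, 2)

    def forward(self, x):
        out = self.head(self.trunk(x))
        return out[..., 0], out[..., 1]          # mu, log_var

def fine_tune_group(global_model, x, y, steps: int = 50):
    """Clone the pretrained global model, freeze the trunk, adapt only the head."""
    local = copy.deepcopy(global_model)
    for p in local.trunk.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(local.head.parameters(), lr=1e-2)
    for _ in range(steps):
        mu, log_var = local(x)
        # Gaussian negative log-likelihood, the usual probabilistic loss
        loss = (log_var + (y - mu) ** 2 / log_var.exp()).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return local

global_model = Forecaster()                       # pretrained on all feeder loads
x, y = torch.randn(64, 24), torch.randn(64)      # one customer group's history
local_model = fine_tune_group(global_model, x, y)
```

Because only the small head is trained per group, the storage cost of the whole family stays close to that of the single global model.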
LAYA: Layer-wise Attention Aggregation for Interpretable Depth-Aware Neural Networks
Deep neural networks typically rely on the representation produced by their final hidden layer to make predictions, implicitly assuming that this single vector fully captures the semantics encoded across all preceding transformations. However, intermediate layers contain rich and complementary information -- ranging from low-level patterns to high-level abstractions -- that is often discarded when the decision head depends solely on the last representation. This paper revisits the role of the output layer and introduces LAYA (Layer-wise Attention Aggregator), a novel output head that dynamically aggregates internal representations through attention. Instead of projecting only the deepest embedding, LAYA learns input-conditioned attention weights over layer-wise features, yielding an interpretable and architecture-agnostic mechanism for synthesizing predictions. Experiments on vision and language benchmarks show that LAYA consistently matches or improves the performance of standard output heads, with relative gains of up to about one percentage point in accuracy, while providing explicit layer-attribution scores that reveal how different abstraction levels contribute to each decision. Crucially, these interpretability signals emerge directly from the model's computation, without any external post hoc explanations. The code to reproduce LAYA is publicly available at: https://github.com/gvessio/LAYA.
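The mechanism described above, input-conditioned attention over per-layer features instead of classifying from the last layer alone, can be sketched as below. Layer vectors are assumed to be pooled to one vector per layer, and the query/key construction is an illustrative choice; see the linked repository for the authors' actual implementation.

```python
import torch
import torch.nn as nn

class LayerAttentionHead(nn.Module):
    """LAYA-style output head sketch: attend over all layers' representations."""
    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)   # query from the last layer
        self.key = nn.Linear(hidden_dim, hidden_dim)     # keys from every layer
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, layer_feats: torch.Tensor):
        # layer_feats: (batch, num_layers, hidden_dim), one pooled vector per layer
        q = self.query(layer_feats[:, -1])               # (batch, hidden_dim)
        k = self.key(layer_feats)                        # (batch, num_layers, hidden_dim)
        scores = torch.einsum("bd,bld->bl", q, k) / k.shape[-1] ** 0.5
        attn = scores.softmax(dim=-1)                    # layer-attribution weights
        mixed = torch.einsum("bl,bld->bd", attn, layer_feats)
        return self.classifier(mixed), attn              # logits + per-layer scores

head = LayerAttentionHead(hidden_dim=16, num_classes=3)
feats = torch.randn(2, 4, 16)                            # 4 layers of a toy network
logits, attn = head(feats)
```

Returning `attn` alongside the logits is what makes the head interpretable: the weights directly state how much each layer contributed to the prediction.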
Learning Conjoint Attentions for Graph Neural Nets: Supplementary Materials
He, Tiantian, Ong, Yew-Soon, Bai, Lu (Agency for Science, Technology and Research (A*STAR))
To prove Theorem 1, we need to consider both directions of the if-and-only-if condition. Obviously, the above equation does not hold, as the terms inside the summation operator are positive. From Eq. (4), we have: [equation omitted]. However, the RHS of Eq. (10) can be an irrational number, while the LHS is a rational number, which is a contradiction. Eq. (9) can be rewritten as: [equation omitted]. To prove Theorem 2, we can follow the procedure used to prove Theorem 1. As Eq. (20) holds for any [text missing], the above equation obviously does not hold, as the softmax function is positive.
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
Liu, Xiaohao, Xia, Xiaobo, Zhao, Weixiang, Zhang, Manyi, Yu, Xianzhi, Su, Xiu, Yang, Shuo, Ng, See-Kiong, Chua, Tat-Seng
Large language models (LLMs) have achieved notable progress. Despite their success, next-token prediction (NTP), the dominant method for LLM training and inference, is constrained in both contextual coverage and inference efficiency due to its inherently sequential process. To overcome these challenges, we propose leap multi-token prediction (L-MTP), an innovative token prediction method that extends the capabilities of multi-token prediction (MTP) by introducing a leap-based mechanism. Unlike conventional MTP, which generates multiple tokens at adjacent positions, L-MTP strategically skips over intermediate tokens, predicting non-sequential ones in a single forward pass. This structured leap not only enhances the model's ability to capture long-range dependencies but also enables a decoding strategy specially optimized for non-sequential leap token generation, effectively accelerating inference. We theoretically demonstrate the benefit of L-MTP in improving inference efficiency.
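The difference between standard MTP and the leap variant can be sketched at the level of the prediction heads: from one hidden state, each head targets a strided future offset (e.g. +1, +3, +5) rather than the adjacent offsets (+1, +2, +3). The specific offsets and layer names here are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class LeapMTPHeads(nn.Module):
    """Sketch of leap multi-token prediction heads over a shared trunk."""
    def __init__(self, hidden_dim: int, vocab_size: int, offsets=(1, 3, 5)):
        super().__init__()
        self.offsets = offsets
        # One lightweight head per leap offset; all share the trunk's hidden state.
        self.heads = nn.ModuleList(
            nn.Linear(hidden_dim, vocab_size) for _ in offsets
        )

    def forward(self, hidden: torch.Tensor):
        # hidden: (batch, seq, hidden_dim) from the trunk transformer.
        # Returns one logit tensor per leap offset, all from a single forward pass.
        return {off: head(hidden) for off, head in zip(self.offsets, self.heads)}

mtp = LeapMTPHeads(hidden_dim=8, vocab_size=11)
logits = mtp(torch.randn(2, 5, 8))
```

Predicting non-adjacent positions in one pass is what opens the door to the specialized decoding strategy the abstract mentions: drafts at strided positions can be interleaved and verified to accelerate generation.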